Inducing concatenative units from machine readable dictionaries and corpora for speech synthesis

Authors

  • Judith L. Klavans
  • Evelyne Tzoukermann
Abstract

Deciding on the set of diphones is quite straightforward in the sense that it suffices to take the phoneme inventory of a language and simply combine each phoneme with every other one. For example, taking the approximately 35 French phonemes, 1225 phonemic pairs (35 x 35) constitute the complete and exhaustive starting diphone inventory. On the other hand, deciding on the set of triphones, quadriphones and larger units raises difficult questions about the nature of phonemes in a given language, such as: (1) stability vs. instability in a coarticulatory environment, (2) size of the overall inventory, and (3) frequency of that unit in the language, in combination with factors (1) and (2). We report on experiments with four different databases, with comparisons between the resources regarding their n-gram frequency output. The first two databases consist of pronunciation field information from two dictionaries, the Encyclopedic Robert French dictionary [16] with 85,000 headwords, and the smaller Collins Gem [13] containing 15,000 words. For comparison, we use two text corpora, the Hansard (about 2.5 million words) and the smaller Tubach and Boe [31] corpus (80,000 words); both corpora were processed by a set of grapheme-to-phoneme rules [18]. A frequency extraction program was applied to all four resources to extract trigram phonemic frequencies; this serves as a basis for comparison between dictionary-derived data and corpus-derived frequencies.
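The two computations mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' frequency extraction program: the function names, the placeholder 35-symbol inventory, and the toy transcriptions are all assumptions made for illustration, and trigrams are counted within each transcription only (no trigram spans a word boundary), which is also an assumption.

```python
from collections import Counter
from itertools import product


def diphone_inventory(phonemes):
    """Return every ordered phoneme pair, including each phoneme paired with itself."""
    return list(product(phonemes, repeat=2))


def trigram_counts(transcriptions):
    """Count phoneme trigrams inside each transcription (a list of phoneme symbols)."""
    counts = Counter()
    for phones in transcriptions:
        # Slide a window of three phonemes over the transcription.
        for i in range(len(phones) - 2):
            counts[tuple(phones[i:i + 3])] += 1
    return counts


# Placeholder inventory of 35 symbols (hypothetical; not real French phonemes).
phonemes = [f"p{i}" for i in range(35)]
inventory = diphone_inventory(phonemes)
print(len(inventory))  # 35 x 35 ordered pairs

# Toy transcriptions standing in for dictionary pronunciation fields (hypothetical).
toy = [
    ["b", "o", "n", "t", "e"],
    ["b", "o", "n"],
    ["t", "o", "n", "t", "e"],
]
counts = trigram_counts(toy)
for trigram, freq in counts.most_common():
    print(trigram, freq)
```

Running the same counter over dictionary-derived and corpus-derived transcriptions would give two frequency tables that can be compared directly, which is the kind of comparison the abstract describes.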


Similar resources

Machine-Readable Dictionaries in Text-to-Speech Systems

This paper presents the results of an experiment using machine-readable dictionaries (MRDs) and corpora for building concatenative units for text-to-speech (TTS) systems. Theoretical questions concerning the nature of phonemic data in dictionaries are raised; phonemic dictionary data is viewed as a representative corpus over which to extract n-gram phonemic frequencies in the language. Dict...


Design of Optimal Slovenian Speech Corpus for Use in the Concatenative Speech Synthesis System

This paper presents the development of a Slovenian speech corpus for use in the concatenative speech synthesis system being developed at the University of Maribor, Slovenia. The emphasis is on maximising the usefulness of the defined speech corpus for concatenation purposes. The usefulness of the speech corpus depends very much on the corresponding text and can be increased...


Development of an English-Macedonian Machine Readable Dictionary by Using Parallel Corpora

Dictionaries are among the most useful lexical resources. However, most dictionaries today are not in digital form. This makes them cumbersome for humans to use and impossible to integrate into computer programs. Digitizing an existing traditional dictionary is an expensive and labor-intensive task. In this paper, we present a method for development of Machine Reada...


A Hybrid Text-to-Speech System that Combines Concatenative and Statistical Synthesis Units

Concatenative synthesis and statistical synthesis are the two main approaches to text-to-speech (TTS) synthesis. Concatenative TTS (CTTS) stores natural speech feature segments, selected from a recorded speech database. Consequently, CTTS systems enable speech synthesis with natural quality. However, as the footprint of the stored data is reduced, desired segments are not always available in t...


ACTOR: A multilingual unit-selection speech synthesis system

The ACTOR® Text-To-Speech (TTS) synthesis system, developed at Loquendo S.p.A., is described here. The system employs a unit-selection concatenative synthesis technique, relying on labeled acoustic databases providing phonetic and prosodic coverage of the intended language/domain and on an original algorithm for run-time selection of the acoustic units to be concatenated. This technique yields...



Publication date: 1994